3,153 research outputs found

    On the Provision of a Comprehensive Computer Graphics Education in the Context of Computer Games

    Position paper for the ACM SIGGRAPH/Eurographics Computer Graphics Education Workshop 200

    Model-Based Reinforcement Learning with Continuous States and Actions

    Finding an optimal policy in a reinforcement learning (RL) framework with continuous state and action spaces is challenging, and approximate solutions are often inevitable. GPDP is an approximate dynamic programming algorithm based on Gaussian process (GP) models of the value functions. In this paper, we extend GPDP to the case of unknown transition dynamics. After building a GP model of the transition dynamics, we apply GPDP to this model and determine a continuous-valued policy over the entire state space. We apply the resulting controller to the underpowered pendulum swing-up. Moreover, we compare our results on this RL task to a nearly optimal discrete DP solution in a fully known environment.
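    The extension described above hinges on first fitting a GP model of the unknown transition dynamics from observed transitions, and only then running GPDP on that model. The sketch below illustrates just that first step, under assumptions the abstract does not state (one independent GP per state dimension, an RBF kernel, and scikit-learn as the GP library); it is not the authors' implementation.

```python
# Minimal sketch (assumptions noted above): learn a GP model of unknown
# transition dynamics from observed (state, action) -> next-state tuples.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_dynamics_gp(states, actions, next_states):
    """Fit one GP per state dimension, mapping (s, a) -> s'."""
    X = np.hstack([states, actions])
    models = []
    for d in range(next_states.shape[1]):
        gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                      normalize_y=True)
        gp.fit(X, next_states[:, d])
        models.append(gp)
    return models

def predict_next_state(models, state, action):
    """Predict the mean next state for a single (state, action) pair."""
    x = np.hstack([state, action]).reshape(1, -1)
    return np.array([gp.predict(x)[0] for gp in models])
```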

    Approximate Dynamic Programming with Gaussian Processes

    In general, it is difficult to determine an optimal closed-loop policy in nonlinear control problems with continuous-valued state and control domains, so approximations are often inevitable. The standard method of discretizing states and controls suffers from the curse of dimensionality and depends strongly on the chosen temporal sampling rate. In this paper, we introduce Gaussian process dynamic programming (GPDP) and determine an approximate, globally optimal closed-loop policy. In GPDP, the value functions in the Bellman recursion of the dynamic programming algorithm are modeled using Gaussian processes. GPDP returns an optimal state feedback for a finite set of states. Based on these outcomes, we learn a possibly discontinuous closed-loop policy on the entire state space by switching between two independently trained Gaussian processes, where a binary classifier selects which Gaussian process predicts the optimal control signal. We show that GPDP yields an almost optimal solution to an LQ problem using few sample points. Moreover, we successfully apply GPDP to the underpowered pendulum swing-up, a complex nonlinear control problem.
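    For intuition, the following sketch shows a single Bellman backup in which the value function over a finite support set of states is modeled with a GP, so it can be evaluated at arbitrary successor states. The kernel choice, the discrete candidate-control set, and the scikit-learn GP are illustrative assumptions, and the classifier-based switching between the two policy GPs is omitted; this is not the paper's implementation.

```python
# Minimal sketch: one Bellman backup with a GP model of the value function
# over a finite support set. `dynamics(s, u)` and `reward(s, u)` are
# assumed user-supplied callables.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def bellman_backup(support_states, V_values, candidate_controls,
                   dynamics, reward, gamma=0.95):
    """Return updated values and greedy controls on the support set."""
    V_gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
    V_gp.fit(support_states, V_values)        # GP model of the value function

    new_V, greedy_u = [], []
    for s in support_states:
        # Evaluate r(s, u) + gamma * V(f(s, u)) for every candidate control.
        q = [reward(s, u)
             + gamma * V_gp.predict(dynamics(s, u).reshape(1, -1))[0]
             for u in candidate_controls]
        new_V.append(max(q))
        greedy_u.append(candidate_controls[int(np.argmax(q))])
    return np.array(new_V), np.array(greedy_u)
```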

    PIPPS: Flexible model-based policy search robust to the curse of chaos

    The exploding gradient problem has previously been identified as a central issue in deep learning and model-based reinforcement learning because it causes numerical issues and instability in optimization. Our experiments in model-based reinforcement learning suggest that the problem is not just a numerical issue: it may be caused by a fundamental chaos-like nature of long chains of nonlinear computations. Not only do the magnitudes of the gradients become large, but their direction also becomes essentially random. We show that reparameterization gradients suffer from this problem, while likelihood ratio gradients are robust. Using these insights, we develop a model-based policy search framework, Probabilistic Inference for Particle-Based Policy Search (PIPPS), which is easily extensible and allows for almost arbitrary models and policies while matching the performance of previous data-efficient learning algorithms. Finally, we introduce the total propagation algorithm, which efficiently computes a union over all pathwise derivative depths during a single backwards pass, automatically giving greater weight to estimators with lower variance and sometimes improving over reparameterization gradients by a factor of 10^6.
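    The contrast between the two gradient estimators can be made concrete on a toy objective. The sketch below (a simplified illustration, not PIPPS or the total propagation algorithm) estimates d/dμ E[f(x)] for x ~ N(μ, σ²) with a reparameterization (pathwise) estimator and with a likelihood ratio estimator using a mean baseline; the function names and the baseline choice are illustrative assumptions.

```python
# Toy comparison of pathwise (reparameterization) and likelihood ratio
# gradient estimators for d/dmu E[f(x)], x ~ N(mu, sigma^2).
import numpy as np

def reparam_grad(f, df, mu, sigma, n=10_000, seed=0):
    """Pathwise estimator: differentiate through the sampled path."""
    eps = np.random.default_rng(seed).standard_normal(n)
    return df(mu + sigma * eps).mean()

def likelihood_ratio_grad(f, mu, sigma, n=10_000, seed=0):
    """Score-function estimator with a mean baseline for variance reduction."""
    x = np.random.default_rng(seed).normal(mu, sigma, n)
    score = (x - mu) / sigma**2            # d/dmu log N(x; mu, sigma^2)
    fx = f(x)
    return ((fx - fx.mean()) * score).mean()

# Example: for f = sin, both estimates should approach cos(mu) * exp(-sigma^2 / 2).
print(reparam_grad(np.sin, np.cos, mu=0.3, sigma=0.1),
      likelihood_ratio_grad(np.sin, mu=0.3, sigma=0.1))
```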

    Manifold Gaussian Processes for regression

    Off-the-shelf Gaussian Process (GP) covariance functions encode smoothness assumptions about the structure of the function to be modeled. For complex and non-differentiable functions, these smoothness assumptions are often too restrictive. One way to alleviate this limitation is to find a different representation of the data by introducing a feature space. This feature space is often learned in an unsupervised way, which can lead to data representations that are not useful for the overall regression task. In this paper, we propose Manifold Gaussian Processes, a novel supervised method that jointly learns a transformation of the data into a feature space and a GP regression from the feature space to the observed space. The Manifold GP is a full GP and allows learning data representations that are useful for the overall regression task. As a proof of concept, we evaluate our approach on complex non-smooth functions where standard GPs perform poorly, such as step functions and robotics tasks with contacts. The research leading to these results has received funding from the European Council under grant agreement #600716 (CoDyCo - FP7/2007–2013). M. P. Deisenroth was supported by a Google Faculty Research Award.
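    A minimal sketch of the joint-learning idea follows, under assumptions the abstract does not spell out: a single tanh layer as the feature map, an RBF kernel with Gaussian noise on the features, and joint optimization of all parameters by numerically differentiated maximization of the GP log marginal likelihood. It is a proof-of-concept illustration, not the authors' implementation.

```python
# Sketch: jointly fit a parametric feature map M(x) = tanh(W x + b) and the
# hyperparameters of an RBF GP on the features by minimizing the negative
# log marginal likelihood (constants dropped).
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal_likelihood(params, X, y, n_feat):
    d = X.shape[1]
    W = params[:n_feat * d].reshape(n_feat, d)
    b = params[n_feat * d:n_feat * d + n_feat]
    log_ell, log_sf, log_sn = params[-3:]

    H = np.tanh(X @ W.T + b)                        # learned feature space
    sq = ((H[:, None, :] - H[None, :, :]) ** 2).sum(-1)
    K = np.exp(2 * log_sf) * np.exp(-0.5 * sq / np.exp(2 * log_ell))
    K += np.exp(2 * log_sn) * np.eye(len(X))        # observation noise
    try:
        L = np.linalg.cholesky(K)
    except np.linalg.LinAlgError:
        return 1e10                                 # penalize ill-conditioned kernels
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum()

def fit_manifold_gp(X, y, n_feat=5, seed=0):
    """Return jointly optimized feature-map and kernel parameters."""
    rng = np.random.default_rng(seed)
    p0 = np.concatenate([0.1 * rng.standard_normal(n_feat * (X.shape[1] + 1)),
                         np.zeros(3)])              # [W, b, log hyperparameters]
    res = minimize(neg_log_marginal_likelihood, p0, args=(X, y, n_feat),
                   method="L-BFGS-B")
    return res.x
```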

    Initial fixation placement in face images is driven by top-down guidance

    The eyes are often inspected first and for longer periods during face exploration. To examine whether this saliency of the eye region at the early stage of face inspection is attributable to its local structural properties or to knowledge of its importance in facial communication, in this study we investigated the pattern of eye movements produced by rhesus monkeys (Macaca mulatta) as they freely viewed images of monkey faces. Eye positions were recorded accurately using implanted eye coils while images of original faces, faces with scrambled eyes, and scrambled faces except for the eyes were presented on a computer screen. The eye region in the scrambled faces attracted the same proportion of viewing time and fixations as it did in the original faces, and even the scrambled eyes attracted a substantial proportion of viewing time and fixations. Furthermore, the monkeys often made the first saccade towards the location of the eyes regardless of image content. Our results suggest that initial fixation placement in faces is driven predominantly by ‘top-down’ or internal factors, such as prior knowledge of the location of “eyes” within the context of a face.

    Impulsive Multivariate Interference Models for IoT Networks

    Device density in wireless internet of things (IoT) networks is rapidly increasing and is expected to continue growing in the coming years. As a consequence, interference is a crucial limiting factor on network performance. This is true for all protocols operating in ISM bands (such as SigFox and LoRa) and in licensed bands (such as NB-IoT). In this paper, with the aim of improving system design, we study the statistics of the interference generated by devices in IoT networks, particularly those exploiting NB-IoT. Existing theoretical and experimental work has suggested that the interference on each subband is well modeled by impulsive noise, such as α-stable noise. If devices operate on multiple, partially overlapping resource blocks (an option standardized in NB-IoT), complex statistical dependence between the interference on different subbands is introduced. To characterize the multivariate statistics of interference on multiple subbands, we develop a new model based on copula theory and demonstrate that it effectively captures both the marginal α-stable model and the dependence structure induced by overlapping resource blocks. We also develop a low-complexity estimation procedure tailored to our interference model, which means that the copula model can often be expressed in terms of standard network parameters without significant calibration delays. We then apply our interference model to optimize receiver design, which provides a tractable means of outperforming existing methods for a wide range of network parameters.
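    As a rough illustration of the modeling idea (not the paper's estimation procedure), the sketch below draws interference samples on two partially overlapping subbands with symmetric α-stable marginals coupled by a Gaussian copula; the α, scale, and correlation values are placeholders.

```python
# Sketch: dependent interference on two subbands, alpha-stable marginals
# tied together by a Gaussian copula (illustrative parameters only).
import numpy as np
from scipy.stats import norm, levy_stable

def sample_subband_interference(n, alpha, scale, corr, seed=0):
    """Draw n joint interference samples, one column per subband."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(len(corr)), corr, size=n)
    u = norm.cdf(z)                      # Gaussian copula: uniform marginals
    # Map each uniform to a symmetric alpha-stable marginal (beta = 0).
    # Note: levy_stable.ppf is evaluated numerically and can be slow.
    return levy_stable.ppf(u, alpha, 0.0, scale=scale)

# Two resource blocks whose overlap is assumed to induce correlation ~0.6.
corr = np.array([[1.0, 0.6],
                 [0.6, 1.0]])
samples = sample_subband_interference(50, alpha=1.5, scale=1.0, corr=corr)
```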